Fast, Accurate, and Scalable Method for Sparse Coupled Matrix-Tensor Factorization

Authors

  • Dongjin Choi
  • Jun-Gi Jang
  • U. Kang
Abstract

How can we capture the hidden properties from a tensor and a matrix data simultaneously? Coupled matrix-tensor factorization (CMTF) is an effective method to solve this problem because it extracts latent factors from a tensor and matrices at once. Designing an efficient CMTF has become more crucial as the size and dimension of real-world data are growing explosively. However, existing methods for CMTF suffer from two problems: 1) methods for dense data are not applicable to sparse real-world data, and 2) methods based on CANDECOMP/PARAFAC (CP) decomposition suffer from high test error because they do not capture correlations between all factors. In this paper, we propose S3CMTF, a fast, accurate, and scalable CMTF method. S3CMTF achieves high speed by exploiting the sparsity of real-world tensors, and high accuracy by capturing interrelations between factors. Also, S3CMTF accomplishes additional speed-up by lock-free parallel SGD update for multi-core shared memory systems. We present two methods, S3CMTF-naive and S3CMTF-opt. S3CMTF-naive is a basic version of S3CMTF, and S3CMTF-opt improves its speed by exploiting intermediate data. We theoretically and empirically show that S3CMTF is the fastest, outperforming existing methods. Experimental results show that S3CMTF is 11∼43× faster and 2.1∼4.1× more accurate than existing methods. S3CMTF shows linear scalability on the number of data entries and the number of cores. In addition, we apply S3CMTF to Yelp dataset with a 3-mode tensor coupled with 3 additional matrices to discover interesting patterns.
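As a rough illustration of the idea in the abstract, the following is a minimal sketch of coupled matrix-tensor factorization trained by SGD over observed entries only. It is not the authors' S3CMTF implementation. It assumes a Tucker-style core tensor `G` as one way to model interrelations between factors, a single matrix coupled to the tensor through its first mode, and plain sequential updates (the lock-free parallel scheme is not shown). All function names, sizes, and the learning rate are hypothetical.

```python
import random

def predict_tensor(A, B, C, G, i, j, k):
    """Tucker-style estimate of entry (i, j, k) from factor rows and core G."""
    R = len(G)
    return sum(G[p][q][r] * A[i][p] * B[j][q] * C[k][r]
               for p in range(R) for q in range(R) for r in range(R))

def predict_matrix(A, D, i, m):
    """Estimate of coupled-matrix entry (i, m); A is shared with the tensor."""
    return sum(a * d for a, d in zip(A[i], D[m]))

def loss(X, Y, A, B, C, G, D):
    """Half the squared error summed over observed tensor and matrix entries."""
    lt = sum((v - predict_tensor(A, B, C, G, i, j, k)) ** 2 for i, j, k, v in X)
    lm = sum((v - predict_matrix(A, D, i, m)) ** 2 for i, m, v in Y)
    return 0.5 * (lt + lm)

def sgd_epoch(X, Y, A, B, C, G, D, lr, rng):
    """One pass of SGD; each step touches only rows indexed by one entry."""
    R = len(G)
    samples = [("t", e) for e in X] + [("m", e) for e in Y]
    rng.shuffle(samples)
    for kind, e in samples:
        if kind == "t":
            i, j, k, v = e
            err = predict_tensor(A, B, C, G, i, j, k) - v
            a, b, c = A[i][:], B[j][:], C[k][:]  # snapshots of pre-update rows
            for p in range(R):
                for q in range(R):
                    for r in range(R):
                        g = G[p][q][r]  # read before this cell is updated
                        A[i][p] -= lr * err * g * b[q] * c[r]
                        B[j][q] -= lr * err * g * a[p] * c[r]
                        C[k][r] -= lr * err * g * a[p] * b[q]
                        G[p][q][r] -= lr * err * a[p] * b[q] * c[r]
        else:
            i, m, v = e
            err = predict_matrix(A, D, i, m) - v
            a, d = A[i][:], D[m][:]
            for p in range(R):
                A[i][p] -= lr * err * d[p]
                D[m][p] -= lr * err * a[p]

def rand_matrix(rows, cols, rng):
    return [[rng.random() for _ in range(cols)] for _ in range(rows)]

def demo(seed=0, epochs=300, lr=0.02):
    """Fit synthetic sparse data; returns (loss before, loss after)."""
    rng = random.Random(seed)
    I, J, K, M, R = 6, 5, 4, 3, 2  # toy sizes, chosen only for illustration
    tA, tB, tC, tD = (rand_matrix(n, R, rng) for n in (I, J, K, M))
    tG = [rand_matrix(R, R, rng) for _ in range(R)]
    all_cells = [(i, j, k) for i in range(I) for j in range(J) for k in range(K)]
    cells = rng.sample(all_cells, 40)  # only 40 of 120 tensor cells are observed
    X = [(i, j, k, predict_tensor(tA, tB, tC, tG, i, j, k)) for i, j, k in cells]
    Y = [(i, m, predict_matrix(tA, tD, i, m))
         for i in range(I) for m in range(M) if rng.random() < 0.6]
    A, B, C, D = (rand_matrix(n, R, rng) for n in (I, J, K, M))
    G = [rand_matrix(R, R, rng) for _ in range(R)]
    before = loss(X, Y, A, B, C, G, D)
    for _ in range(epochs):
        sgd_epoch(X, Y, A, B, C, G, D, lr, rng)
    return before, loss(X, Y, A, B, C, G, D)

if __name__ == "__main__":
    print(demo())
```

Because each update reads and writes only the factor rows indexed by one observed entry (plus the small core), the cost per epoch grows with the number of nonzeros rather than with the full tensor size, which is the sparsity argument the abstract makes.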

Similar resources

FlexiFaCT: Scalable Flexible Factorization of Coupled Tensors on Hadoop

Given multiple data sets of relational data that share a number of dimensions, how can we efficiently decompose our data into the latent factors? Factorization of a single matrix or tensor has attracted much attention, as, e.g., in the Netflix challenge, with users rating movies. However, we often have additional, side, information, like, e.g., demographic data about the users, in the Netflix e...

Bayesian Factorization Machines

This work presents simple and fast structured Bayesian learning for matrix and tensor factorization models. An unblocked Gibbs sampler is proposed for factorization machines (FM) which are a general class of latent variable models subsuming matrix, tensor and many other factorization models. We empirically show on the large Netflix challenge dataset that Bayesian FM are fast, scalable and more ...

Scoup-SMT: Scalable Coupled Sparse Matrix-Tensor Factorization

How can we correlate neural activity in the human brain as it responds to words, with behavioral data expressed as answers to questions about these same words? In short, we want to find latent variables, that explain both the brain activity, as well as the behavioral responses. We show that this is an instance of the Coupled Matrix-Tensor Factorization (CMTF) problem. We propose SCOUP-SMT, a nov...

A social recommender system based on matrix factorization considering dynamics of user preferences

With the expansion of social networks, the use of recommender systems in these networks has attracted considerable attention. Recommender systems have become an important tool for alleviating the information overload problem of users by providing personalized recommendations of items a user might like, based on past preferences or observed behavior about one or various items. In these systems...

Sparse non-negative tensor factorization using columnwise coordinate descent

Many applications in computer vision, biomedical informatics, and graphics deal with data in the matrix or tensor form. Non-negative matrix and tensor factorization, which extract data-dependent non-negative basis functions, have been commonly applied for the analysis of such data for data compression, visualization, and detection of hidden information (factors). In this paper, we present a fas...

Journal:
  • CoRR

Volume abs/1708.08640

Publication date 2017